- Home
- Search Results
- Page 1 of 1
Search for: All records
-
Total Resources3
- Resource Type
-
0002100000000000
- More
- Availability
-
30
- Author / Contributor
- Filter by Author / Creator
-
-
Vinciguerra, Giorgio (3)
-
Abrar, Md Hasin (2)
-
Medvedev, Paul (2)
-
Dunne, Cody (1)
-
Gatterbauer, Wolfgang (1)
-
Yang, Guang (1)
-
#Tyler Phillips, Kenneth E. (0)
-
#Willis, Ciara (0)
-
& Abreu-Ramos, E. D. (0)
-
& *Soto, E. (0)
-
& Abramson, C. I. (0)
-
& Abreu-Ramos, E. D. (0)
-
& Adams, S.G. (0)
-
& Ahmed, K. (0)
-
& Ahmed, Khadija. (0)
-
& Aina, D.K. Jr. (0)
-
& Akcil-Okan, O. (0)
-
& Akuom, D. (0)
-
& Aleven, V. (0)
-
& Andrews-Larson, C. (0)
-
- Filter by Editor
-
-
Benoit, Anne (1)
-
Herman, Grzegorz (1)
-
Kaplan, Haim (1)
-
Wild, Sebastian (1)
-
& Spizer, S. M. (0)
-
& . Spizer, S. (0)
-
& Ahn, J. (0)
-
& Bateiha, S. (0)
-
& Bosch, N. (0)
-
& Brennan K. (0)
-
& Brennan, K. (0)
-
& Chen, B. (0)
-
& Chen, Bodong (0)
-
& Drown, S. (0)
-
& Ferretti, F. (0)
-
& Higgins, A. (0)
-
& J. Peters (0)
-
& Kali, Y. (0)
-
& Ruiz-Arias, P.M. (0)
-
& S. Spitzer (0)
-
-
Have feedback or suggestions for a way to improve these results?
!
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Benoit, Anne; Kaplan, Haim; Wild, Sebastian; Herman, Grzegorz (Ed.)Data structures on a multiset of genomic k-mers are at the heart of many bioinformatic tools. As genomic datasets grow in scale, the efficiency of these data structures increasingly depends on how well they leverage the inherent patterns in the data. One recent and effective approach is the use of learned indexes that approximate the rank function of a multiset using a piecewise linear function with very few segments. However, theoretical worst-case analysis struggles to predict the practical performance of these indexes. We address this limitation by developing a novel measure of piecewise-linear approximability of the data, called CaPLa (Canonical Piecewise Linear approximability). CaPLa builds on the empirical observation that a power-law model often serves as a reasonable proxy for piecewise linear-approximability, while explicitly accounting for deviations from a true power-law fit. We prove basic properties of CaPLa and present an efficient algorithm to compute it. We then demonstrate that CaPLa can accurately predict space bounds for data structures on real data. Empirically, we analyze over 500 genomes through the lens of CaPLa, revealing that it varies widely across the tree of life and even within individual genomes. Finally, we study the robustness of CaPLa as a measure and the factors that make genomic k-mer multisets different from random ones.more » « less
-
Abrar, Md Hasin; Medvedev, Paul; Vinciguerra, Giorgio (, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
-
Vinciguerra, Giorgio; Yang, Guang; Gatterbauer, Wolfgang; Dunne, Cody (, ACM)
An official website of the United States government

Full Text Available